Idiom-Aware Compositional Distributed Semantics

Authors

  • Pengfei Liu
  • Kaiyu Qian
  • Xipeng Qiu
  • Xuanjing Huang
Abstract

Idioms are peculiar linguistic constructions that pose great challenges for representing the semantics of language, especially in the currently prevailing end-to-end neural models, which assume that the semantics of a phrase or sentence can be literally composed from its constituent words. In this paper, we propose an idiom-aware distributed semantic model that builds representations of sentences on the basis of understanding the idioms they contain. Our models are grounded in the literal-first psycholinguistic hypothesis and can adaptively learn the semantic compositionality of a phrase, either literally or idiomatically. To better evaluate our models, we also construct an idiom-enriched sentiment classification dataset of considerable scale and with abundant peculiarities of idioms. Qualitative and quantitative experimental analyses demonstrate the efficacy of our models. The newly introduced datasets are publicly available at http://nlp.fudan.edu.cn/data/

Similar resources

Towards Syntax-aware Compositional Distributional Semantic Models

Compositional Distributional Semantics Models (CDSMs) are traditionally seen as an entirely different world with respect to Tree Kernels (TKs). In this paper, we show that under a suitable regime these two approaches can be regarded as the same and, thus, structural information and distributional semantics can successfully cooperate in CDSMs for NLP tasks. Leveraging on distributed trees, we pres...

Idiom Token Classification using Sentential Distributed Semantics

Idiom token classification is the task of deciding for a set of potentially idiomatic phrases whether each occurrence of a phrase is a literal or idiomatic usage of the phrase. In this work we explore the use of Skip-Thought Vectors to create distributed representations that encode features that are predictive with respect to idiom token classification. We show that classifiers using these repr...

Minimum Description Length and Compositionality

In [12] we have shown that the standard definition of compositionality is formally vacuous; that is, any semantics can be easily encoded as a compositional semantics. We have also shown that when compositional semantics is required to be "systematic", it is possible to introduce a non-vacuous concept of compositionality. However, a technical definition of systematicity was not given in that paper...

The Essence of Form Abstraction

Abstraction is the cornerstone of high-level programming; HTML forms are the principal medium of web interaction. However, most web programming environments do not support abstraction of form components, leading to a lack of compositionality. Using a semantics based on idioms, we show how to support compositional form construction and give a convenient syntax.

Compositional Syntax and Semantics of Tables

Parnas together with a number of colleagues established the systematic use of certain kinds of tables as a useful tool in software documentation and inspection with an accessible, multidimensional syntax and intuitive semantics. Previous approaches to formalisation of table semantics based their definitions on the multi-dimensional array structure of tables and thus achieved close correspondenc...

Publication date: 2017